LTLf/LDLf Non-Markovian Rewards

Authors

Ronen I. Brafman

Giuseppe De Giacomo

Fabio Patrizi

Abstract

In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., depends on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work. Both use LTL variants to specify the reward function and then compile the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
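The idea of compiling a non-Markovian reward back into a Markovian one can be illustrated with a small sketch. The code below is not the paper's LDLf-to-automaton construction; it assumes a hand-written two-state automaton for the abstract's "coffee only following a request" example, with hypothetical labels `request` and `coffee` and hypothetical helper names `dfa_step` and `rewards_along`. Once the automaton state is carried alongside the MDP state, the reward depends only on the current (extended) state and the current step.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical propositional labels observed at each step of a trace.
Label = FrozenSet[str]

# Hand-written two-state automaton (an assumption, not the paper's construction):
#   "idle"    - no outstanding request
#   "pending" - a request was made and coffee has not yet been delivered
def dfa_step(q: str, label: Label) -> Tuple[str, bool]:
    """Return (next automaton state, whether the reward fires on this step)."""
    if q == "pending" and "coffee" in label:
        return "idle", True        # delivery answers a pending request
    if "request" in label:
        return "pending", False    # remember the request
    return q, False                # nothing relevant happened


def rewards_along(trace: Iterable[Label], bonus: float = 1.0) -> List[float]:
    """Replay a trace; the automaton state carries the history the reward needs."""
    q = "idle"
    out: List[float] = []
    for label in trace:
        q, fired = dfa_step(q, label)
        out.append(bonus if fired else 0.0)
    return out


# Coffee served without a request earns nothing; coffee after a request is rewarded.
trace = [frozenset({"coffee"}), frozenset({"request"}), frozenset(), frozenset({"coffee"})]
print(rewards_along(trace))  # [0.0, 0.0, 0.0, 1.0]
```

In a full compilation, the pair (MDP state, automaton state) would serve as the state of the extended MDP, so a standard MDP solver could be applied unchanged.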

Similar resources

Non-Markovian Rewards Expressed in LTL: Guiding Search Via Reward Shaping

We propose an approach to solving Markov Decision Processes with non-Markovian rewards specified in Linear Temporal Logic interpreted over finite traces (LTLf). Our approach integrates automata representations of LTLf formulae into compiled MDPs that can be solved by off-the-shelf MDP planners, exploiting reward shaping to help guide search. Experiments with state-of-the-art UCT-based MDP plan...

LTLf and LDLf Monitoring: A Technical Report

Runtime monitoring is one of the central tasks to provide operational decision support to running business processes, and check on-the-fly whether they comply with constraints and rules. We study runtime monitoring of properties expressed in LTL on finite traces (LTLf) and in its extension LDLf. LDLf is a powerful logic that captures all monadic second order logic on finite traces, which is o...

LTLf and LDLf Synthesis under Partial Observability

In this paper, we study synthesis under partial observability for logical specifications over finite traces expressed in LTLf/LDLf. This form of synthesis can be seen as a generalization of planning under partial observability in nondeterministic domains, which is known to be 2EXPTIME-complete. We start by showing that the usual “belief-state construction” used in planning under partial observ...

Linear Temporal Logic and Linear Dynamic Logic on Finite Traces

In this paper we look into the assumption of interpreting LTL over finite traces. In particular we show that LTLf, i.e., LTL under this assumption, is less expressive than what might appear at first sight, and that at essentially no computational cost one can make a significant increase in expressiveness while maintaining the same intuitiveness of LTLf interpreted over finite traces. Indeed, w...

Monitoring Business Metaconstraints Based on LTL & LDL for Finite Traces

Runtime monitoring is one of the central tasks to provide operational decision support to running business processes, and check on-the-fly whether they comply with constraints and rules. We study runtime monitoring of properties expressed in LTL on finite traces (LTLf) and its extension LDLf. LDLf is a powerful logic that captures all monadic second order logic on finite traces, which is obta...

Publication date: 2017
